I would like to introduce a tool for masking sensitive information in HAR files

2024.10.05

Hello, this is Makoto Nakamura from Annotation.
Today, I would like to introduce a tool for masking sensitive information in HAR files.

Precaution

Please note that there is no guarantee that all sensitive information will be removed using the tool introduced in this article.
Be sure to check that no sensitive information remains before providing any information.
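
As one way to support that manual check, here is a minimal Python sketch that lists cookie and header values that still look unmasked after sanitizing. It is only an aid, not a guarantee. The helper name `find_unmasked`, the set of header names treated as sensitive, and the assumption that masked values end with `redacted]` (the style HAR Sanitizer produces in the example later in this article) are all illustrative choices, not part of any of the tools.

```python
# Header names treated as sensitive here are an assumption; extend as needed.
SENSITIVE_HEADERS = {"cookie", "set-cookie", "authorization"}
# Assumed marker: masked values end with "redacted]".
MASK_SUFFIX = "redacted]"


def find_unmasked(har):
    """Return (entry index, side, kind, name) for values that still look unmasked."""
    leftovers = []
    for index, entry in enumerate(har["log"]["entries"]):
        for side in ("request", "response"):
            part = entry.get(side, {})
            for cookie in part.get("cookies", []):
                value = cookie.get("value", "")
                # Empty values carry nothing to leak, so only non-empty ones are flagged.
                if value and not value.endswith(MASK_SUFFIX):
                    leftovers.append((index, side, "cookie", cookie["name"]))
            for header in part.get("headers", []):
                value = header.get("value", "")
                is_sensitive = header["name"].lower() in SENSITIVE_HEADERS
                if is_sensitive and value and not value.endswith(MASK_SUFFIX):
                    leftovers.append((index, side, "header", header["name"]))
    return leftovers


# Hypothetical sample: the cookie is masked, the Authorization header is not.
sample = {
    "log": {
        "entries": [
            {
                "request": {
                    "cookies": [{"name": "sid", "value": "[sid redacted]"}],
                    "headers": [{"name": "Authorization", "value": "Bearer abc"}],
                },
                "response": {"cookies": [], "headers": []},
            }
        ]
    }
}

print(find_unmasked(sample))
```

An empty result only means that nothing matched these particular rules. Sensitive data in URLs, query parameters, or response bodies is not covered, so a manual review is still needed.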

Tool Introduction

Here are the links to the tools first; the details are explained later in this article.

https://github.com/google/har-sanitizer

https://github.com/scottmcmaster/harsanitizer-docker

https://github.com/cloudflare/har-sanitizer

What is a HAR File?

As described on our company blog, an HTTP Archive (HAR) file is a JSON file that records a browser's recent network activity.

https://dev.classmethod.jp/articles/tsnote-harfile-chrome-supportcase/
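
To make the JSON structure concrete, here is a minimal Python sketch that loads a tiny HAR-shaped document and lists each recorded request. The `log.entries[].request` and `response` layout follows the HAR 1.2 format; the sample values and the helper name `summarize_har` are made up for illustration.

```python
import json

# Hypothetical, hand-written sample in the HAR 1.2 layout (not real traffic).
SAMPLE_HAR = json.loads("""
{
  "log": {
    "version": "1.2",
    "entries": [
      {
        "request": {
          "method": "GET",
          "url": "https://example.com/",
          "headers": [{"name": "cookie", "value": "xxxxx"}],
          "cookies": [{"name": "cookie-name", "value": "xxxx"}]
        },
        "response": {"status": 200, "headers": [], "cookies": []}
      }
    ]
  }
}
""")


def summarize_har(har):
    """Return one (method, url, status) tuple per recorded entry."""
    return [
        (e["request"]["method"], e["request"]["url"], e["response"]["status"])
        for e in har["log"]["entries"]
    ]


print(summarize_har(SAMPLE_HAR))
```

Note that the same cookie data appears twice in one request, once as a raw `cookie` header and once in the parsed `cookies` array.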

Because you may need to obtain and share a HAR file when contacting AWS Support, HAR files are also covered in the AWS Knowledge Center.

https://repost.aws/knowledge-center/support-case-browser-har-file

HAR files are necessary for troubleshooting, but they may capture sensitive information such as passwords and cookies.
Both our technical support and AWS Support ask customers to mask any sensitive information before sharing a HAR file.

AWS official site

Again, remove any confidential information from them.

Bulk Replacement Can Be Challenging

HAR files may contain numerous cookies and other sensitive information, and manually masking all of this information can be time-consuming.
Additionally, since the sensitive information in the HAR file's JSON may not follow a consistent format, as shown below, it can be challenging to perform replacements using a text editor.

{
  "name": "cookie",
  "value": "xxxxx"
}

{
  "cookies": [
    {
      "name": "cookie-name",
      "value": "xxxx",
      "path": "/",
      "domain": "example.com",
      "expires": "yyyy-mm-ddThh:mm:ss.000Z",
      "httpOnly": true,
      "secure": true,
      "sameSite": "None"
    }
  ]
}

As an experiment, I obtained a HAR file from the AWS Management Console and searched for "cookie" in a text editor, resulting in 1,500 hits.
Manually masking all of these would be highly inefficient.
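
A structural walk over the JSON copes with both shapes more reliably than a text search. The following is a minimal Python sketch, not the approach used by any of the tools below: the function name `find_cookie_values` is hypothetical, and it only recognizes the two shapes shown above (a `cookie` or `set-cookie` header object, and objects inside a `cookies` array).

```python
def find_cookie_values(node, under_cookies=False):
    """Yield (kind, name) for every cookie-bearing object below node."""
    if isinstance(node, dict):
        name = node.get("name")
        if isinstance(name, str) and "value" in node:
            if under_cookies:
                # An element of a "cookies" array.
                yield ("cookies-array", name)
            elif name.lower() in ("cookie", "set-cookie"):
                # A raw header object carrying cookie data.
                yield ("header", name)
        for key, child in node.items():
            yield from find_cookie_values(child, under_cookies=(key == "cookies"))
    elif isinstance(node, list):
        for child in node:
            yield from find_cookie_values(child, under_cookies)


# The two fragments shown above, placed in one list for the demonstration.
fragments = [
    {"name": "cookie", "value": "xxxxx"},
    {"cookies": [{"name": "cookie-name", "value": "xxxx", "path": "/"}]},
]

for kind, name in find_cookie_values(fragments):
    print(kind, name)
```

Counting the yielded items per kind gives a quick picture of how many values would need masking in a real file.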

Masking sensitive information is of course essential for security, but engineers want to spend their time on development and operations, not on masking.

Therefore, I investigated tools that can mask sensitive information in HAR files in bulk and found three tools, which I will introduce below.

Replacement Tool 1: HAR Sanitizer

The first tool is HAR Sanitizer.

https://github.com/google/har-sanitizer

This tool is available in Google's GitHub repository, but it is clearly stated within the repository that it is not an official Google product.

Usage is simple: open the GitHub repository above and follow the link to the live version at the following URL:

https://har-sanitizer.appspot.com/

When you access the above URL, the following screen will be displayed.

[Screenshot: HAR Sanitizer start screen]

When you click "LOAD HAR" in the upper right corner and select a HAR file, four options will be displayed: "COOKIES," "HEADERS," "URLQUERY/POSTDATA PARAMS," and "CONTENT MIMETYPES."

[Screenshots: the four option lists displayed after loading a HAR file]

Each item has a checkbox, and checking it masks the corresponding item.

As a test, I loaded a dummy HAR file, selected "All Cookies" under "COOKIES," and downloaded the resulting HAR file.

[Screenshots: selecting "All Cookies" and downloading the HAR file]

When you check the values of the cookies in the downloaded HAR file, you will see that they have been masked as shown below.

{
  "cookies": [
    {
      "name": "1P_JAR",
      "value": "[1P_JAR redacted]",
      "path": "/",
      "domain": "example.com",
      "expires": "yyyy-mm-ddThh:mm:ss.000Z",
      "httpOnly": true,
      "secure": true,
      "sameSite": "None"
    }
  ]
}

It appears that the tool replaces each cookie value with "[cookie-name redacted]".
Other sensitive values are masked in the same format.

This is quite convenient!
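
To illustrate the output format observed above, here is a minimal Python sketch that applies the same "[cookie-name redacted]" pattern to the `cookies` arrays of a HAR document. It imitates the observed result only and is not HAR Sanitizer's actual implementation. The function name `redact_cookies` is hypothetical, and raw `cookie` headers, query parameters, and response bodies are deliberately left untouched.

```python
import copy


def redact_cookies(har):
    """Return a copy of har with every cookies[].value set to '[<name> redacted]'."""
    redacted = copy.deepcopy(har)  # leave the caller's data unchanged
    for entry in redacted["log"]["entries"]:
        for side in ("request", "response"):
            for cookie in entry.get(side, {}).get("cookies", []):
                cookie["value"] = f"[{cookie['name']} redacted]"
    return redacted


# Hypothetical sample with a dummy cookie value.
sample = {
    "log": {
        "entries": [
            {
                "request": {"cookies": [{"name": "1P_JAR", "value": "dummy-value"}]},
                "response": {"cookies": []},
            }
        ]
    }
}

masked = redact_cookies(sample)
print(masked["log"]["entries"][0]["request"]["cookies"][0]["value"])
```

Working on a deep copy keeps the original data available for comparison, which helps when checking what was and was not masked.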

However, a concern with HAR Sanitizer is the risk associated with uploading HAR files containing sensitive information to a website.
If the contents of the HAR file are stolen by a third party, there is a danger of unauthorized access to the AWS environment.

Therefore, I investigated tools that can be executed in a local environment and found the second tool, which I will introduce next.

Replacement Tool 2: harsanitizer-docker

As the name suggests, it is a tool that allows you to use the previously introduced HAR Sanitizer in a Docker environment.

https://github.com/scottmcmaster/harsanitizer-docker

This appears to be an individual developer's GitHub repository, not Google's.
However, I have confirmed that it works, so I will introduce it here.

As a prerequisite, you need to install Docker, so please refer to the Docker documentation for the installation process.

The contents of harsanitizer-docker are the same as HAR Sanitizer, so all you need to do is follow the README and execute the docker run command.

$ docker run -d -p 8080:8080 scottmcmaster/harsanitizer:1.1

Once the container has started, access localhost on port 8080.

http://localhost:8080

As confirmed below, HAR Sanitizer can be accessed in the local environment as well.

[Screenshot: HAR Sanitizer opened at http://localhost:8080]

After that, follow the same steps as with HAR Sanitizer.
This way, you can mask sensitive information without uploading HAR files to a website.

For reference, here are the steps to launch it in Cloud9.

  1. Create a Cloud9 EC2 environment.
  2. In the terminal, run the command docker run -d -p 8080:8080 scottmcmaster/harsanitizer:1.1.
  3. After the command execution is complete, click on "Preview Running Application".
  4. Although HAR Sanitizer can be used in the preview state, if you want to enlarge the screen, click the icon in the upper right corner to open it in a separate tab.

[Screenshots: the Cloud9 steps above]

Replacement Tool 3: cloudflare/har-sanitizer

Cloudflare also provides har-sanitizer on their GitHub.

https://github.com/cloudflare/har-sanitizer

Here are the necessary steps:

$ git clone https://github.com/cloudflare/har-sanitizer.git
$ cd har-sanitizer
$ npm run dev

If you encounter an error related to concurrently, install concurrently and run the command again.

$ npm i concurrently
$ npm run dev

If the execution is successful, a local URL will be displayed as shown below. Access this URL:

➜  Local:   http://127.0.0.1:3000/

[Screenshot: cloudflare/har-sanitizer opened at the local URL]

As a side note

I wasn't the only one investigating methods to mask sensitive information in HAR files in bulk.

https://stackoverflow.com/questions/65853586/how-to-remove-cookie-value-from-har-file-before-sharing-it-with-another-person

https://issues.chromium.org/issues/40441005

The pages linked above also introduce a method for replacing the values using commands.
If you prefer working on the command line, please refer to them.

Summary

In this article, I introduced three tools for masking sensitive information in HAR files.

Since HAR files can sometimes contain sensitive information unexpectedly, it's important to use masking tools effectively to protect company and personal information.

We hope you find this article helpful.

© Classmethod, Inc. All rights reserved.